The IBM LVCSR System Used for 1998 Mandarin Broadcast News Transcription Evaluation

Authors

  • XueFeng Guo
  • WeiBin Zhu
  • Qin Shi
  • Scott Chen
  • Ramesh Gopinath
Abstract

This paper presents the technologies implemented in IBM's Large Vocabulary Continuous Speech Recognition (LVCSR) system, which was designed for the 1998 Mandarin broadcast news transcription evaluation task. Compared with the 1997 system, it focuses on acoustic improvements, implementing several new schemes such as LDA and MLLT transformation matrices, the BIC model selection criterion, and SAT and CAT models. In addition, new language model components and a new vocabulary were built. Some other schemes that were tried are also described.

Similar resources

Dragon Systems’ 1998 Broadcast News Transcription System for Mandarin

In this paper we shall describe Dragon Systems’ 1998 Broadcast News transcription system for Mandarin. We shall describe our music classifier, which was unique to our Mandarin system, as well as our speaker change detection algorithm, which was used in our English and Mandarin systems. We shall also report on preliminary, post-evaluation experiments with pitch.

Transcription of broadcast news - some recent improvements to IBM's LVCSR system

This paper describes extensions and improvements to IBM’s large vocabulary continuous speech recognition (LVCSR) system for transcription of broadcast news. The recognizer uses an additional 35 hours of training data over the one used in the 1996 Hub4 evaluation [?]. It includes a number of new features: optimal feature space for acoustic modeling (in training and/or testing), filler-word model...

Development of CSLU LVCSR: The 1997 DARPA Hub4 Evaluation System

This paper presents the CSLU Broadcast News transcription system used in the DARPA 1997 evaluation. The system was built using the software developed for the CSLU LVCSR project, which started in January 1997. This 25K-word vocabulary system used continuous HMMs for acoustic modeling and the standard backoff trigram as the language model. The search used a single pass decoder with MLLR based adaptation ...

A Broadcast News Corpus for Evaluation and Tuning of German LVCSR Systems

Transcription of broadcast news is an interesting and challenging application for large-vocabulary continuous speech recognition (LVCSR). We present in detail the structure of a manually segmented and annotated corpus including over 160 hours of German broadcast news, and propose it as an evaluation framework of LVCSR systems. We show our own experimental results on the corpus, achieved with a ...

Toward automatic transcription of Japanese broadcast news

In this paper, we report on the automatic recognition of Japanese broadcast-news speech. We have been working on large-vocabulary continuous speech recognition (LVCSR) for Japanese newspaper speech transcription and have achieved good performance. We have recently applied our LVCSR system to transcribing Japanese broadcast-news speech. We extended the vocabulary from 7k words to 20k words and tr...

Publication date: 1999